Robust Variational Bayes by Min-Max Median Aggregation
Yan, Jiawei, Liu, Ju, Liu, Weidong, Tu, Jiyuan
We propose a robust and scalable variational Bayes (VB) framework designed to handle contamination and outliers in datasets. Our approach partitions the data into $m$ disjoint subsets and formulates a joint optimization problem based on robust aggregation principles. A key insight is that the full posterior distribution is the minimizer of the mean Kullback-Leibler (KL) divergence from the $m$-powered local posterior distributions. To enhance robustness, we replace the mean KL divergence with a min-max median formulation. This formulation not only ensures consistency between the KL minimizer and the Evidence Lower Bound (ELBO) maximizer but also facilitates improved statistical rates for the mean of the variational posterior. We observe a notable discrepancy in the $m$-powered marginal log-likelihood function depending on the presence of local latent variables, and therefore treat the two scenarios separately to guarantee consistency of the aggregated variational posterior. Specifically, when local latent variables are present, we introduce an aggregate-and-rescale strategy. Theoretically, we provide a non-asymptotic analysis of the proposed posterior, including a refined analysis of the Bernstein-von Mises (BvM) theorem that accommodates a diverging number of subsets $m$. Our findings indicate that the two-stage approach yields a smaller approximation error than directly aggregating the $m$-powered local posteriors. Furthermore, we establish a nearly optimal statistical rate for the mean of the proposed posterior, advancing existing theory on min-max median estimators. The efficacy of our method is demonstrated through extensive simulation studies.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
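The robustness idea in the abstract above can be illustrated with a toy example of why median aggregation of subset posteriors resists contamination. This is only a minimal sketch of the general principle, not the paper's min-max median ELBO procedure: it assumes a Gaussian mean with a flat prior, so each local posterior mean reduces to a subset sample mean.

```python
import random
import statistics

random.seed(0)

# Toy setup: n observations from N(2, 1), with gross outliers concentrated
# in the first subset; the data are split into m disjoint subsets.
n, m = 2000, 20
data = [random.gauss(2.0, 1.0) for _ in range(n)]
data[:60] = [50.0] * 60                       # contamination

# For a Gaussian mean under a flat prior, the local posterior mean is just
# the subset sample mean, so aggregation reduces to combining these means.
size = n // m
local_means = [statistics.fmean(data[i * size:(i + 1) * size]) for i in range(m)]

mean_agg = statistics.fmean(local_means)      # mean aggregation: dragged by outliers
median_agg = statistics.median(local_means)   # median aggregation: resists them
print(f"mean aggregation:   {mean_agg:.2f}")
print(f"median aggregation: {median_agg:.2f}")
```

Because the contamination is confined to a minority of subsets, the median of the local means stays near the true value 2.0 while the mean is pulled far upward.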
Appendix for When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting. Code for EpiFNP and the wILI dataset is publicly available.
Deep learning is also suitable because it can ingest data from multiple sources, which better informs the model of what is happening on the ground. Our work aims to close this gap in the literature. Existing approaches for uncertainty quantification can be categorized into three lines. The second line tries to combine stochastic processes with DNNs. The third line is based on model ensembling [24], which trains multiple DNNs with different initializations and uses their predictions for uncertainty quantification.
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Epidemiology (1.00)
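The ensembling line mentioned above ([24]) can be sketched in a toy form. This is a hedged illustration, not the cited method: a linear model is convex, so different initializations alone would converge to the same fit, and the sketch therefore adds bootstrap resampling per member to stand in for the diversity that random initialization provides in non-convex DNNs.

```python
import random

random.seed(1)

# Toy 1-D data: y = 3x + noise.
xs = [i / 10 for i in range(50)]
ys = [3 * x + random.gauss(0, 0.1) for x in xs]

def train(seed, steps=2000, lr=0.01):
    """Fit y = w*x + b by gradient descent from a random initialization,
    on a bootstrap resample of the data (to diversify the members)."""
    rng = random.Random(seed)
    idx = [rng.randrange(len(xs)) for _ in range(len(xs))]
    bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
    w, b = rng.gauss(0, 1), rng.gauss(0, 1)
    for _ in range(steps):
        gw = 2 * sum((w * x + b - y) * x for x, y in zip(bx, by)) / len(bx)
        gb = 2 * sum((w * x + b - y) for x, y in zip(bx, by)) / len(bx)
        w, b = w - lr * gw, b - lr * gb
    return w, b

# The ensemble: the same data, several independently trained members.
members = [train(seed) for seed in range(5)]

def predict(x):
    """Predictive mean and spread across the ensemble members."""
    preds = [w * x + b for w, b in members]
    mu = sum(preds) / len(preds)
    var = sum((p - mu) ** 2 for p in preds) / len(preds)
    return mu, var ** 0.5

mu, sd = predict(2.0)
print(f"prediction at x=2.0: {mu:.2f} +/- {sd:.2f}")
```

The disagreement among members (here the standard deviation of their predictions) is what the ensembling line uses as the uncertainty estimate.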
Variational Inference for Latent Variable Models in High Dimensions
Zhong, Chenyang, Mukherjee, Sumit, Sen, Bodhisattva
In modern applications, these models typically involve a large number of parameters and latent variables, resulting in complex and high-dimensional posteriors that are computationally intractable. For such scenarios, traditional Markov chain Monte Carlo (MCMC) approaches often suffer from lengthy burn-in periods and generally lack scalability [11]. Recently, variational inference (VI) [31, 10, 52, 11] has emerged as a popular and scalable alternative method for approximating intractable posterior distributions in large-scale applications (where the number of observations and dimensionality are both large) and is typically orders of magnitude faster than MCMC methods. Among the various forms of VI, arguably the most widely used and important is mean-field variational inference (MFVI) [52, 11], which approximates the intractable posterior by a product distribution. This approach has been widely adopted in statistics and machine learning, thanks to efficient algorithmic implementations based on coordinate ascent variational inference (CAVI) [10, 11, 19, 7, 5, 36, 14, 34].
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
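The CAVI scheme mentioned in the abstract above can be made concrete with a classic conjugate example: mean-field inference for the mean and precision of a univariate Gaussian. This is a textbook setup, not taken from this paper; the two factor updates are alternated until the expected precision stabilizes.

```python
import random

random.seed(2)

# Data from N(mu=1.5, sigma=0.5); we infer the mean mu and precision tau.
x = [random.gauss(1.5, 0.5) for _ in range(500)]
n, xbar = len(x), sum(x) / len(x)
ss = sum((xi - xbar) ** 2 for xi in x)

# Priors: mu ~ N(mu0, (lam0*tau)^-1), tau ~ Gamma(a0, b0).
mu0, lam0, a0, b0 = 0.0, 1.0, 1.0, 1.0

# Mean-field factorization q(mu, tau) = q(mu) q(tau);
# CAVI alternates the two coordinate updates.
E_tau = 1.0
for _ in range(20):
    # Update q(mu) = N(mu_n, 1/lam_n).
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
    lam_n = (lam0 + n) * E_tau
    # Update q(tau) = Gamma(a_n, b_n).
    a_n = a0 + (n + 1) / 2
    E_sq = ss + n * (xbar - mu_n) ** 2 + n / lam_n   # E_q[sum (x_i - mu)^2]
    b_n = b0 + 0.5 * (E_sq + lam0 * ((mu_n - mu0) ** 2 + 1 / lam_n))
    E_tau = a_n / b_n

print(f"posterior mean of mu:  {mu_n:.3f}")
print(f"posterior mean of tau: {E_tau:.3f}  (true precision = 4.0)")
```

Each update holds one factor fixed and sets the other to the exponentiated expected log joint, which is what makes CAVI a coordinate-ascent scheme on the ELBO.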
Reparameterized Variational Rejection Sampling
Traditional approaches to variational inference rely on parametric families of variational distributions, with the choice of family playing a critical role in determining the accuracy of the resulting posterior approximation. Simple mean-field families often lead to poor approximations, while rich families of distributions like normalizing flows can be difficult to optimize and usually do not incorporate the known structure of the target distribution due to their black-box nature. To expand the space of flexible variational families, we revisit Variational Rejection Sampling (VRS) [Grover et al., 2018], which combines a parametric proposal distribution with rejection sampling to define a rich non-parametric family of distributions that explicitly utilizes the known target distribution. By introducing a low-variance reparameterized gradient estimator for the parameters of the proposal distribution, we make VRS an attractive inference strategy for models with continuous latent variables. We argue theoretically and demonstrate empirically that the resulting method--Reparameterized Variational Rejection Sampling (RVRS)--offers an attractive trade-off between computational cost and inference fidelity. In experiments we show that our method performs well in practice and that it is well-suited for black-box inference, especially for models with local latent variables.
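The proposal-plus-rejection idea that VRS builds on can be illustrated with plain rejection sampling. This is not the RVRS estimator (which uses a relaxed acceptance function and reparameterized gradients for the proposal parameters), just a sketch of how an acceptance test turns a simple parametric proposal into samples from a richer target.

```python
import math
import random

random.seed(3)

def normal_pdf(z, mu, sigma):
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def target(z):
    """Bimodal target: 0.5 N(-2, 0.5) + 0.5 N(2, 0.5)."""
    return 0.5 * normal_pdf(z, -2, 0.5) + 0.5 * normal_pdf(z, 2, 0.5)

# Parametric proposal q(z) = N(0, 3) and an envelope constant M with
# target(z) <= M * q(z) everywhere (checked numerically on a grid here).
M = 5.0
assert all(target(z) <= M * normal_pdf(z, 0, 3)
           for z in (i / 100 for i in range(-600, 601)))

def sample(n):
    """Draw n samples from the target via accept/reject on the proposal."""
    out = []
    while len(out) < n:
        z = random.gauss(0, 3)
        if random.random() < target(z) / (M * normal_pdf(z, 0, 3)):
            out.append(z)
    return out

zs = sample(5000)
frac_right = sum(z > 0 for z in zs) / len(zs)
print(f"fraction of samples in the right mode: {frac_right:.3f}")
```

The accepted samples follow the bimodal target exactly even though the proposal is a single Gaussian; VRS makes this implied distribution the variational family and tunes the proposal parameters.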
Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables
Neural processes (NPs) constitute a family of variational approximate models for stochastic processes with promising properties in computational efficiency and uncertainty quantification. These processes use neural networks with latent variable inputs to induce predictive distributions. However, the expressiveness of vanilla NPs is limited because they use only a global latent variable, while target-specific local variation can sometimes be crucial. To address this challenge, we investigate NPs systematically and present a new NP variant that we call the Doubly Stochastic Variational Neural Process (DSVNP). This model combines the global latent variable with local latent variables for prediction. We evaluate the model in several experiments, and the results demonstrate competitive predictive performance in multi-output regression and uncertainty estimation in classification.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
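The global-plus-local latent structure in the DSVNP abstract can be sketched as a toy doubly stochastic predictive sampler. Everything here is hypothetical: the Gaussian latents and the linear "decoder" are stand-ins for illustration, not the paper's architecture. Each Monte Carlo sample draws one shared global latent and one per-target local latent, and the spread of the resulting predictions gives the uncertainty estimate.

```python
import random

random.seed(4)

def predict(x, n_samples=2000):
    """Monte Carlo predictive distribution with two sources of stochasticity:
    a global latent z_g shared across targets and a local latent z_l drawn
    per target (both hypothetical stand-ins for learned latent variables)."""
    preds = []
    for _ in range(n_samples):
        z_g = random.gauss(0.0, 0.3)        # global latent: shared variation
        z_l = random.gauss(0.0, 0.1)        # local latent: target-specific variation
        preds.append(2.0 * x + z_g + z_l)   # stand-in for a neural decoder
    mu = sum(preds) / len(preds)
    var = sum((p - mu) ** 2 for p in preds) / len(preds)
    return mu, var ** 0.5

mu, sd = predict(1.0)
print(f"predictive mean {mu:.2f}, predictive sd {sd:.2f}")
```

Marginalizing over both latents is what makes the scheme "doubly stochastic": the predictive variance combines the global and local contributions rather than a single shared one.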